pyAffy: An efficient Python/Cython
Author
Abstract
Robust multi-array average (RMA) is a highly successful method for processing raw data from Affymetrix expression microarrays. However, most of the work on microarray data processing predates the widespread use of Python in scientific computing. Here, I describe pyAffy, an efficient implementation of the RMA method in Python/Cython. Using data from the MAQC project, I show that this implementation produces results virtually identical to those of the RMA reference implementation in the affy R package, while running more than five times faster and consuming significantly less memory. I also show how individual steps of the RMA method affect the final expression estimates. The source code for pyAffy is available from PyPI and GitHub (https://github.com/flo-compbio/pyaffy) under an OSI-approved license. I intend to periodically revise this manuscript to ensure that it accurately reflects the functionality available in the pyAffy Python package.
Similar resources
The XL-mHG test for gene set enrichment
The nonparametric minimum hypergeometric (mHG) test is a popular alternative to Kolmogorov-Smirnov (KS)-type tests for determining gene set enrichment. However, these approaches have not been compared to each other in a quantitative manner. Here, I first perform a simulation study to show that the mHG test is significantly more powerful than the one-sided KS test for detecting gene set enrich...
PyFAI: a Python library for high performance azimuthal integration on GPU
The pyFAI package has been designed to reduce X-ray diffraction images into powder diffraction curves to be further processed by scientists. This contribution describes how to convert an image into a radial profile using the NumPy package, and how the process was accelerated using Cython. The algorithm was parallelised, needing a complete re-design to benefit from massively parallel devices like gr...
CyNEST: a maintainable Cython-based interface for the NEST simulator
NEST is a simulator for large-scale networks of spiking point neuron models (Gewaltig and Diesmann, 2007). Originally, simulations were controlled via the Simulation Language Interpreter (SLI), a built-in scripting facility implementing a language derived from PostScript (Adobe Systems, Inc., 1999). The introduction of PyNEST (Eppler et al., 2008), the Python interface for NEST, enabled users t...
A Comparison of Five Programming Languages in a Graph Clustering Scenario
The recent rise of social networks fuels the demand for efficient social web services, whose performance strongly benefits from the availability of fast graph clustering algorithms. Choosing a programming language heavily affects multiple aspects in this domain, such as runtime performance, code size, maintainability and tool support. Thus, an impartial comparison can provide valuable insights ...
Performance of Python runtimes on a non-numeric scientific code
The Python library FatGHoL [FatGHoL] used in [Murri2012] to reckon the rational homology of the moduli space of Riemann surfaces is an example of a non-numeric scientific code: most of the processing it does is generating graphs (represented by complex Python objects) and computing their isomorphisms (a triple of Python lists; again a nested data structure). These operations are repeated many t...
Publication date: 2016